Constructing and Evaluating Controlled Bilingual Terminologies

Authors

  • Rei Miyata
  • Kyo Kageura
Abstract

This paper presents the construction and evaluation of controlled Japanese–English bilingual terminologies intended for controlled authoring and machine translation, with special reference to the Japanese municipal domain. Our terminologies are constructed by extracting terms from municipal website texts, and term variation is controlled by defining preferred and proscribed terms for both the source Japanese and the target English. To assess the coverage of the terms/concepts in the municipal domain and validate the quality of the control, we employ a quantitative extrapolation method that estimates the potential vocabulary size. Using Large-Number-of-Rare-Events (LNRE) modelling, we compare the terminologies along two dimensions: (1) uncontrolled versus controlled and (2) Japanese versus English. The results show that our terminologies currently cover about 45–65% of the terms and 50–65% of the concepts in the municipal domain, and are well controlled. The detailed analysis of the growth patterns of the terminologies also provides insight into the extent to which we can realistically enlarge them.
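The idea of estimating a potential vocabulary size and deriving coverage from it can be illustrated with a minimal sketch. Note that this is not the LNRE modelling used in the paper: it substitutes the much simpler Chao1 estimator, which extrapolates the number of unseen types from the counts of terms observed once and twice. The function names and the toy term frequencies below are hypothetical and serve only as an illustration.

```python
from collections import Counter


def chao1_estimate(term_counts):
    """Estimate the total number of distinct terms (seen + unseen).

    Uses the Chao1 estimator, NOT an LNRE model: the number of unseen
    types is extrapolated from f1 (terms seen once) and f2 (terms seen twice).
    """
    spectrum = Counter(term_counts.values())  # frequency -> number of terms
    observed = len(term_counts)
    f1 = spectrum.get(1, 0)
    f2 = spectrum.get(2, 0)
    if f2 > 0:
        return observed + f1 * f1 / (2 * f2)
    # Bias-corrected form, used when no term occurs exactly twice.
    return observed + f1 * (f1 - 1) / 2


def coverage(term_counts):
    """Share of the estimated vocabulary that has already been observed."""
    return len(term_counts) / chao1_estimate(term_counts)


# Hypothetical toy frequencies (not data from the paper).
toy_counts = {
    "住民票": 5,
    "国民健康保険": 3,
    "転入届": 2,
    "印鑑登録": 2,
    "児童手当": 1,
    "粗大ごみ": 1,
    "戸籍謄本": 1,
    "納税証明書": 1,
}

estimated_size = chao1_estimate(toy_counts)
observed_share = coverage(toy_counts)
print(f"estimated vocabulary size: {estimated_size:.1f}")
print(f"coverage: {observed_share:.1%}")
```

Running the same computation separately on uncontrolled and controlled term lists, or on the Japanese and English sides, would give a rough analogue of the comparisons described in the abstract, under the assumption that a simple estimator is adequate for the data.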

Similar resources

Terminology-driven Augmentation of Bilingual Terminologies

This paper proposes a way of augmenting bilingual terminologies by using a “generate and validate” method. Using existing bilingual terminologies, the method generates “potential” bilingual multi-word term pairs and validates their status by searching web documents to check whether such terms actually exist in each language. Unlike most existing bilingual term extraction methods, which use para...

Evaluation of terminologies acquired from comparable corpora: an application perspective

This paper describes a protocol for the evaluation of bilingual terminologies acquired from comparable corpora. The aim of the protocol is to assess the terminologies' added value in a task of specialized translation. The protocol consists in having specialized texts translated in various situations: without any specialized resource, with a domain-related bilingual terminology or using Internet...

Building bilingual terminologies from comparable corpora: the TTC TermSuite

In this paper, we exploit domain-specific comparable corpora to build bilingual terminologies. We present the monolingual term extraction and the bilingual alignment that allow us to identify and translate highly specialised terminology. We stress the importance of taking into account both simple and complex terms in a multilingual environment. Such linguistic diversity implies to combi...

A Method of Augmenting Bilingual Terminology by Taking Advantage of the Conceptual Systematicity of Terminologies

In this paper, we propose a method of augmenting existing bilingual terminologies. Our method belongs to a “generate and validate” framework rather than extraction from corpora. Although many studies have proposed methods to find term translations or to augment terminology within a “generate and validate” framework, few have taken full advantage of the systematic nature of terminologies. A termi...

The Use of Terminological Theory Approach in the Development of Ontology Based Bilingual Terminology System on College Campus of Taiwan

For the majority of college students in Taiwan, learning the terminology of a specific domain and using it interchangeably in Chinese and English is quite a challenge. Most students seek assistance from library resources or search for answers on the web. Unfortunately, the students are often unable to judge the correctness of their findings or, worse, the students cannot choose th...

Publication date: 2016